Computing Partitions within SQL Queries: A Dead End?
Authors
Abstract
The primary goal of relational databases is to provide efficient query processing on sets of tuples; query evaluation and optimization strategies are therefore a key issue in database implementation. Producing universally fast execution plans remains a challenging task, since the underlying relational model has a significant impact on the algebraic definition of the operators and thereby on their implementation in terms of space and time complexity. At the very least, an implementation should avoid quadratic behavior if it is to scale up to the processing of large datasets. The main purpose of this paper is to show that there is no trivial relational modeling for managing collections of partitions (i.e. sets of sets). In the modeling we retain, we show that one cannot express all the operators of the partition lattice and the set-theoretic operations of the algebra of sets (viewing blocks as elements) within first-order logic (FO), and consequently as queries of the relational algebra (RA). We also show several pieces of evidence of the inefficiency of RA-expressible operators, together with an alternative, which warrant another computational model. Further, we present experimental results that reinforce this evidence and conclude that R-DBMS are inadequate for partition querying. Hence, we claim that there is a strong requirement for the design of an ad hoc system to manage partitions, or at least to supplement an existing system to which both data persistence and transaction management could be delegated.
hal-00768156, version 1, 20 Dec 2012
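As an illustration of the two partition lattice operators the abstract refers to, here is a minimal Python sketch. It is not the paper's modeling: the representation (a partition as a set of `frozenset` blocks) and the function names `meet` and `join` are assumptions made for this example. The meet only compares pairs of blocks, which is the kind of operation a relational join can express, although the naive form below is quadratic in the number of blocks. The join has to merge overlapping blocks transitively, which amounts to a transitive closure, a query well known not to be expressible in first-order logic.

```python
from itertools import product


def meet(p, q):
    """Coarsest common refinement: every non-empty pairwise block intersection."""
    # Naive pairwise comparison: |p| * |q| intersections.
    return {a & b for a, b in product(p, q) if a & b}


def join(p, q):
    """Finest common coarsening: merge overlapping blocks transitively."""
    parent = {}  # union-find forest over the elements

    def find(x):
        # Follow parent links to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for block in list(p) + list(q):
        items = iter(block)
        first = next(items)  # blocks are assumed non-empty
        parent.setdefault(first, first)
        for x in items:
            parent.setdefault(x, x)
            parent[find(x)] = find(first)  # union with the block's first element

    blocks = {}
    for x in parent:
        blocks.setdefault(find(x), set()).add(x)
    return {frozenset(b) for b in blocks.values()}


# Two hypothetical partitions of {1, 2, 3, 4, 5}.
p = {frozenset({1, 2}), frozenset({3}), frozenset({4, 5})}
q = {frozenset({1}), frozenset({2, 3}), frozenset({4, 5})}
```

With these inputs, `meet(p, q)` splits `{1, 2}` and `{2, 3}` into singletons, while `join(p, q)` chains them into the single block `{1, 2, 3}`; the block `{4, 5}` is unchanged in both.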
Similar resources
A First Attempt to Computing Generic Set Partitions: Delegation to an SQL Query Engine
Partitions are a very common and useful way of organizing data, in data engineering and data mining. However, partitions currently lack efficient and generic data management functionalities. This paper proposes advances in the understanding of this problem, as well as elements for solving it. We formulate the task as efficient processing, evaluating and optimizing queries over set partitions, i...
ParGRES: a middleware for executing OLAP queries in parallel
ParGRES is a middleware aimed at efficiently processing heavyweight queries, typical of OLAP, on top of a database cluster. ParGRES achieves query processing speed-up through intra- and inter-query parallelism in a PC cluster environment with database replication and virtual partitioning. It accelerates both individual queries and system throughput. Our experimental results show that ParGRES yields...
Towards Evaluation of a Symmetric XPath Axis in a Tree-Unaware RDBMS
XML query languages use asymmetric path expressions to locate data in an XML data collection. They are tightly coupled to the structure of a data collection, and can fail when evaluated on the same data in a different structure. This paper extends path expressions with a new symmetric axis, the closest axis, that contains closest nodes to the context node within a specified distance in any dire...
Towards Practical Differential Privacy for SQL Queries
Differential privacy promises to enable general data analytics while protecting individual privacy, but existing differential privacy mechanisms do not support the wide variety of features and databases used in real-world SQL-based analytics systems. This paper presents the first practical approach for differential privacy of SQL queries. Using 8.1 million real-world queries, we conduct an empi...
The Effects of Information Request Ambiguity and Construct Incongruence on Query Development
This paper examines the effects of information request ambiguity and construct incongruence on end users’ ability to develop SQL queries with an interactive relational database query language. In this experiment, ambiguity in information requests adversely affected accuracy and efficiency. Incongruities among the information request, the query syntax, and the data representation adversely affec...